perm filename APE.OLD[UP,DOC] blob
sn#038554 filedate 1973-04-27 generic text, type T, neo UTF8
STANFORD ARTIFICIAL INTELLIGENCE LABORATORY 27 Apr 1973
READING THE ASSOCIATED PRESS NEWS
By Martin Frost
ABSTRACT:
We have a line from the Associated Press (AP) over which we get
national and international news (no local news). The line is read by
a program that takes incoming news stories and files them away on the
disk, keeping about 24 hours' worth of news on file at any given
time. This document (which exists as the file APE.ME[UP,DOC])
describes usage of programs that allow access to the AP news.
The Associated Press news report is made available in these programs
for demonstration and research purposes only. Caution must be
exercised to ensure that the news is not published, broadcast,
publicly displayed, or used for any commercial purpose.
27 Apr 1973 AP News Page 1
For use in reading the news, there are two programs on the system.
The first of these is HOT, which is a very small program that simply
types out the stories as they come in. The second program is called
APE; by categorizing each story from a list of keywords, it enables
the user to selectively read the news on file.
To use the hot line, simply type the monitor command: R HOT. The
program should type back: "...Associated Press news..."; if it
doesn't, then it is having trouble contacting the program [-AP-],
which listens to the AP line. In this case, the program will try for
about thirty seconds to contact [-AP-], after which time it will give
up and tell you so. After "...Associated Press news..." is typed
out, you will get whatever news is coming in. There are times,
usually of only a few minutes' duration, when no news is coming in; at
such times, HOT will of course type out nothing. WARNING: Typing
control-C or holding the typeout while the news is coming in will
probably cause HOT to miss some characters. If that happens, your
job number will be scratched from the list of jobs getting the
hotline news; so you will have to restart HOT.
Now before describing the second program (APE), I will explain a few
things about AP news stories. First of all, each story sent by the
AP has a sequence number which comes at the beginning of the story
and a date and time that come at the end. After the sequence number,
we insert the date and time (Pacific time) we received the story.
The sequence numbers start over every day, with the first story that
comes after about midnight EST getting number 001. Some special
stories (advance stories) are given sequence numbers out of the
normal order; these stories have numbers greater than 400. The time
at the end of each story is the approximate New York time when the
story was sent over the wire.
Every twelve hours (at about noon and midnight EST) there is a news
digest that summarizes the stories that are known to be coming in
over the next twelve hours. The digest at midnight is usually story
number 002 and is called the PMs digest; the one at noon is usually
number 202 and is called the AMs digest. No PMs digest is sent for
Sunday. The digests are not categorized by our AP programs; to
access them you must use one of the two methods described in
paragraph 2 under SPECIAL FEATURES. Stories that have been
mentioned in the latest digest bear the heading word "BJT" (for
"budget").
Each day there are many stories that are corrections or additions to
previous stories. We try to link up such a follow-up with the
original and treat the resultant combination as one story, although
it may be made up of two, three, or even more separately numbered
stories. Any attempt to retrieve with APE any story of such a group
will result in retrieval of all parts of the group in chronological
order. (Long stories are broken up into smaller parts by the
Associated Press; the smaller parts are called TAKES and each gets
its own sequence number. We try to link all takes of the same story
together just like additions and corrections.)
THE ASSOCIATED PRESS EXTRACTOR (APE)
The program that is used to retrieve news is called APE. It allows
quick access to the stories because of a data structure that is
continually being updated by other programs. As each story comes in
over the AP wire, it is categorized by keywords from a special list
(the keyword dictionary). For each keyword in the dictionary a list
is kept of all the stories that word occurs in. To access the news,
you select the keyword or combination of keywords that you wish to
read about. A keyword can be either a single word (or number) or a
sequence of words (and/or numbers). For example, the following are
some possible keywords: WELFARE, WAR, SAN FRANCISCO, UNITED STATES,
UNION OF SOVIET SOCIALIST REPUBLICS, PDP 10, etc.
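The data structure described above can be pictured as an inverted index that maps each dictionary keyword to the set of stories containing it. The Python sketch below is only an illustration and not the actual APE implementation: the names DICTIONARY, tokenize, contains and categorize, the sample stories, and the rule that a keyword matches as a run of whole words are all assumptions.

```python
import re
from collections import defaultdict

# Hypothetical miniature keyword dictionary (the real one has about 1000 entries).
DICTIONARY = ["WELFARE", "WAR", "SAN FRANCISCO", "PDP 10"]

def tokenize(text):
    """Split text into upper-case words and numbers."""
    return re.findall(r"[A-Z0-9]+", text.upper())

def contains(tokens, keyword_tokens):
    """True if keyword_tokens occurs as a contiguous run inside tokens."""
    k = len(keyword_tokens)
    return any(tokens[i:i + k] == keyword_tokens
               for i in range(len(tokens) - k + 1))

def categorize(index, story_id, text):
    """Add story_id to the set kept for every dictionary keyword it mentions."""
    tokens = tokenize(text)
    for keyword in DICTIONARY:
        if contains(tokens, tokenize(keyword)):
            index[keyword].add(story_id)

# keyword -> set of story ids, updated as each story arrives
index = defaultdict(set)
categorize(index, 1, "Welfare reform debated in San Francisco")
categorize(index, 2, "A new PDP-10 arrives; no war news today")
```

With an index of this shape, looking up a keyword costs one dictionary access rather than a scan of the whole news file, which is the reason retrieval by dictionary keyword is quick.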
The keyword dictionary contains about 1000 words, mostly people's
names and names of places (cities, states, countries). This list is
expandable, and if you have any words you would like added to the
list, SEND a note to ME. To see a list of the keywords, read the
file WORDS.SRT[AP,SYS].
All input lines to APE should be terminated with carriage returns.
Old APE users should note that this is a change from earlier
versions.
KEYWORD EXPRESSIONS
To retrieve stories using APE, you type in a KEYWORD EXPRESSION,
which may be either a single keyword or an expression containing
keywords and the operators +, -, and *. Each keyword represents the
set of all the stories it occurs in. And the operators represent the
set operations UNION (+), INTERSECTION (*), and SET DIFFERENCE (-),
which are performed on the sets of stories which the keywords
represent. Thus, if you want all stories that mention both Nixon and
McGovern, you should type the keyword expression "NIXON*MCGOVERN".
The precedence of operators is the normal one: * takes precedence
over + and -, which have equal precedence. Operators with equal
precedence are evaluated from left to right. Parentheses may be used
freely in keyword expressions. Note that + - * are BINARY operators
only.
To clarify all this a little, here are a few examples:
Keyword Expression                 Meaning
---------------------------------  --------------------------------
(NIXON-WALLACE+MCGOVERN)*ELECTION  All stories that mention both
                                   ELECTION and either (1) NIXON
                                   and not WALLACE or (2) MCGOVERN
                                   (and possibly WALLACE).
ELECTION-NIXON-WALLACE-MCGOVERN    All stories that mention
                                   ELECTION but that mention none
                                   of NIXON, WALLACE and MCGOVERN.
SAN FRANCISCO+LOS ANGELES-WAR      All stories that mention either
                                   SAN FRANCISCO or LOS ANGELES,
                                   but not WAR.
Note: Spaces are needed only to separate individual words of multiple
word keywords, but they may be used anywhere except in the middle of
a word or special form.
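The precedence and set-operation rules of this section can be sketched as a small recursive-descent evaluator. This is a hedged illustration in Python, not APE's own code: the function name evaluate and the index argument (a mapping from keyword to a set of story ids) are hypothetical, unknown keywords are simply treated as matching no stories, and the special forms described under SPECIAL FEATURES are not handled.

```python
import re

def evaluate(expression, index):
    """Evaluate a keyword expression against index (keyword -> set of story ids)."""
    # Operators and parentheses are single-character tokens; everything
    # between them is one (possibly multi-word) keyword.
    parts = re.split(r"([+\-*()])", expression.upper())
    tokens = [" ".join(p.split()) for p in parts if p.strip()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def primary():
        tok = peek()
        if tok is None or tok in ("+", "-", "*", ")"):
            raise ValueError("keyword or '(' expected")
        take()
        if tok == "(":
            result = sum_level()
            if peek() != ")":
                raise ValueError("')' expected")
            take()
            return result
        return set(index.get(tok, ()))    # copy, so the index is never mutated

    def product_level():                  # '*' binds tightest
        result = primary()
        while peek() == "*":
            take()
            result &= primary()
        return result

    def sum_level():                      # '+' and '-', evaluated left to right
        result = product_level()
        while peek() in ("+", "-"):
            if take() == "+":
                result |= product_level()
            else:
                result -= product_level()
        return result

    result = sum_level()
    if peek() is not None:
        raise ValueError("unexpected " + peek())
    return result
```

Because the operators are binary only, an expression that begins with an operator is rejected by this sketch with a ValueError.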
RUNNING APE
To run APE, type the monitor command: R APE. When APE starts up it
reads in various files, and it is possible that another program will
be writing one of these files. In that case, APE will say "One
moment please..." and will wait until it can read the file. After
all the files have been read in, APE will respond with
KEYWORD EXPRESSION:
You should then type in a keyword expression as defined in the
previous section. APE will count the stories that match your
expression and tell you how many stories it has found, such as:
5 news item(s) found. Selection:
At this point, you can select any contiguous group from the stories
found. For example, you can read the oldest 4 stories of those
matching your keywords, or you can read the newest 3, or the 2nd
through the 4th, or all of them, or none of them, etc. And you can
have the stories you select typed out, spooled (on the line printer)
and/or saved in a file on your disk area, all by typing in the
appropriate selection line as explained below.
The syntax for the selection line is as follows, where [...] denotes
an optional quantity and ...|... denotes exclusive alternatives: (The
order of different parts of the selection line is irrelevant except
that any filename must come first.)
[ <filenm> [/Q|/X] ← ] <story selection> [=] [S] [K] [W] [C|F|L|D]
<filenm> is a filename of up to 6 characters (no extension or PPN is
allowed). If the <filenm> term is present, the stories selected
will be saved in the given file. If you do not say either /Q or
/X after the filename, then if the file already exists, you will
be told so and asked what to do. The presence and end of the
<filenm> term is indicated by the left arrow (←). File output is
not allowed with the F or L options (see below).
/Q following the filename means replace file if it already exists.
/X following the filename means extend file if it already exists.
The <story selection> indicates which stories you wish to select from
those found. It also indicates in what order you want to read the
stories. The syntax for <story selection> is:
N | <nbr>[:<nbr>] | <empty>
where <nbr> is a positive or negative integer and <empty> is the
empty string. If a single integer (k or -k) appears, it indicates
how many stories you wish to read. If the number is positive, you
will get the k most recent stories; if the number is negative, you
will get the k oldest stories. In either case, the stories will
come out in reverse chronological order, that is, newest stories
first. Two integers separated by a colon (:) indicate a range of
stories. For instance, "2:4" represents the second through the
fourth most recent stories. Negative numbers in this construction
represent the oldest stories; for example, "-2:-4" represents the
second oldest story through the fourth oldest story in that order.
The stories will come out in the order specified; that is, "-4:-2"
represents the same stories as "-2:-4" but in opposite order.
<empty> means select all the stories in reverse chronological order.
N means select None of the stories.
= means reverse the order in which the stories come out.
S means Spool the selected stories (not allowed with F or L options).
K means Kill automatic reading from command file (see section below
on command files).
W means type out the Words each story is categorized by.
C means Choose which stories get typed out completely (see below).
F means type out only the First few lines of each story.
L means type out only the Last few lines of each story.
D means Don't type out the stories at all (useful if you are saving
the stories in a file or spooling them).
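The rules for <story selection> and the = option can be summarized in a short Python sketch. It is an illustration under stated assumptions, not APE's code: the function name select_stories is hypothetical, the list of found stories is assumed to be ordered newest first, the = option is passed as a separate reverse flag, the other option letters and the filename term are not parsed, and the handling of out-of-range numbers (an error for ranges, clamping for a single count) is a guess.

```python
def select_stories(found, selection="", reverse=False):
    """Apply a <story selection> to `found`, a list ordered newest first."""
    n = len(found)

    def index_of(nbr):
        # Positive numbers count from the newest story, negative from the oldest.
        if nbr == 0 or abs(nbr) > n:
            raise ValueError("story number out of range")
        return nbr - 1 if nbr > 0 else n + nbr

    selection = selection.strip().upper()
    if selection == "N":                       # None of the stories
        chosen = []
    elif selection == "":                      # <empty>: all, newest first
        chosen = list(found)
    elif ":" in selection:                     # range, in the order given
        first, last = (index_of(int(part)) for part in selection.split(":"))
        step = 1 if first <= last else -1
        chosen = [found[i] for i in range(first, last + step, step)]
    else:                                      # k newest or k oldest
        k = int(selection)
        # Either way the result stays in reverse chronological order.
        chosen = found[:k] if k >= 0 else found[max(n + k, 0):]
    return chosen[::-1] if reverse else chosen
```

For example, with five stories found, select_stories(found, "-2:-4") yields the second oldest, third oldest and fourth oldest stories in that order, while select_stories(found, "2", reverse=True) corresponds to the selection line "=2".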
If you use the Choose feature, then for each story the first few
lines will be typed out and you will be expected to indicate whether
you want to read the rest of the story. You will NOT be prompted at
this point; the typeout will simply stop, often in the middle of a
word. If you do not want to read the rest of the story, type just
carriage return. To read the rest of the story, type altmode,
linefeed, or any character (except "I") followed by carriage return.
If you don't want to read any more of the stories, type "I" and
carriage return. This has the same effect as [ESC] I followed by
carriage return (see paragraph 8 under SPECIAL FEATURES). The
character(s) you type will not be echoed, so the story will appear
unbroken. You will be allowed to quit reading a story at the
beginning of each part (take, correction, etc.) of the story. If you
are saving stories in a file or spooling them, then only those you
choose to read will be put in the file and/or spooled.
Here are some selection line examples and their meanings.
2              Type out the newest two stories.
-2             Type out the oldest two stories.
=2             Type out the newest two stories in chronological order.
               (Normal order is reverse chronological order.)
(Blank line.)  Type out all the stories.
F2             Type only the first few lines of each of the two newest
               stories.
-2:5           Type out the 2nd oldest story through the 5th newest
               story.
=5:-2          Same as -2:5.
2:2            The only way to get just the 2nd newest story.
=              Type out all the stories in chronological order.
FOO←           Type out all the stories and save them in the file FOO.
FOO/Q←D        Don't type out anything, but put all stories into the
               file FOO. If the file already exists, then delete the
               old version.
FOO/X←SC5      For the newest 5 stories, type out the first few lines
               and let me Choose whether I want to see the rest of
               the story. Extend the file FOO with any stories I
               choose and then Spool it.
L              Type out only the last few lines of each story. (The
               last few lines include mainly the time and date of the
               story.)
N              Do Nothing with the stories found. (Get next keyword
               expression.)
------------
If you ask for the stories to be saved in a file, the file will be
given the standard extension ".AP" and will be put on your own disk
area (or your ALIAS area if you currently have an ALIAS). If you ask
for the stories to be spooled but not saved in a file, APE will
create a file with a name like $NEWS0.AP, which will be spooled and
then deleted. (The file $NEWS0.AP will be put on your real disk area
(NOT your ALIAS area) so that the spooler can delete it.)
SEARCHING THE NEWS FILE
While an expression is being read, if a keyword is encountered that
is not in the keyword dictionary, you will be told so and asked if
you would like a search done for that keyword in the news. If you
want a search done, type "Y" (and a carriage return) for Yes. Type
just carriage return if you don't want a search.
During a search, every time a story is found containing the
searched-for keyword, an asterisk (*) will be typed out. Should you wish to
discontinue the search at any time, type a carriage return or, on
Stanford displays, [ESC] I. Any stories found up to that time will
represent the particular keyword in the expression as if searching
had gone to completion. Stories are searched in the order of newest
to oldest. For every keyword not in the dictionary, a separate
search must be done. However, once you have said Yes to searching,
subsequent keywords in the same expression will be searched for
automatically without your being asked. You may, of course,
interrupt such a search (by typing a carriage return or [ESC] I).
Multiple word keywords may be searched for just like single word
keywords, but only those instances where the whole multiple word
keyword occurs on the same line in the news will be found (this is
the result of an important search optimization).
Searching the whole news file for a keyword takes about 8 to 10
seconds of computer time. If, however, an unrecognized keyword
occurs as the SECOND part of an intersection or difference operation
(e.g., NIXON * JJJJ or NIXON - JJJJ), then only the necessary stories
are searched and the search time is generally very much smaller.
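Two properties of searching described above lend themselves to a short sketch: a multiple word keyword is found only when it lies wholly on one line, and a search that is the second operand of * or - need only look at the stories already selected by the first operand. The Python below is an assumption-laden illustration, not the APE search routine: the function name search, the stories mapping (story id to text) and the candidates argument are all hypothetical, and the newest-to-oldest search order and the interrupt facility are not modeled.

```python
import re

def search(stories, keyword, candidates=None):
    """Return the ids of stories in which `keyword` occurs wholly on one line.

    stories:    dict mapping story id -> story text
    candidates: optional set of ids to scan instead of the whole news file
    """
    words = keyword.split()
    # Words of a multi-word keyword may be separated by spaces or tabs,
    # but never by a line break, since each line is scanned on its own.
    pattern = re.compile(
        r"\b" + r"[ \t]+".join(re.escape(w) for w in words) + r"\b",
        re.IGNORECASE)
    ids = stories if candidates is None else candidates
    return {sid for sid in ids
            if any(pattern.search(line) for line in stories[sid].splitlines())}

def intersect_with_search(first_operand, stories, keyword):
    """NIXON * JJJJ style evaluation: scan only the stories already selected."""
    return first_operand & search(stories, keyword, candidates=first_operand)
```

When the first operand selects only a handful of stories, the restricted scan touches just those stories, which is consistent with the much smaller search time noted above.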
COMMAND FILE INPUT
If, in place of a keyword expression, you type an at-sign (@)
followed by a file name (extension and/or PPN allowed!), then APE
will endeavor to read a keyword expression and then possibly a
selection line from the file. APE can handle most (if not all) text
file formats, including SOS and E/TV. After you have opened a
command file in this manner, if you type just an at-sign for a
keyword expression, APE will read another keyword expression (and
selection line) from the command file. This can continue until the
end of the file is reached, at which time APE will type out [EOF] to
let you know.
If you follow the at-sign in either case above with an exclamation
point (!), then APE will automatically read from the command file
whenever a keyword expression is needed. This automatic reading from
the command file can be stopped by using the K option in the
selection line (see above). This cancels the effect of the
exclamation point. Whenever the selection line is read from the
command file (see below), however, you don't get a chance to type the
K. If you use the system REEnter command or type [ESC] I while
stories are being typed out (see paragraph 8 under SPECIAL
FEATURES), or if you type I<crlf> when choosing stories (see the
Choose option for the selection line under RUNNING APE), then automatic
command file reading will be turned off. Any error in an expression
read from the file will also turn off automatic reading.
If you type an at-sign (and optionally an exclamation point) without
a filename at a time when you have no command file open, then the
standard command file name APE.CMD will be assumed. You may open the
file APE.CMD on someone else's disk area by typing, for example,
"@[FOO,BAZ]" or "@![FOO,BAZ]".
Now a word about how command files are interpreted.
When reading from a command file, APE reads until a semicolon (;) or
comma (,) is found. All carriage returns, linefeeds and form feeds
(page marks) are completely ignored. (That means a keyword can be
split between two lines or even two pages!) If, when reading a
keyword expression from a file, a comma is encountered, then the
stuff following the comma and up to the next comma or semicolon is
assumed to be the selection line you want for this particular keyword
expression. On the other hand, if a keyword expression is terminated
with a semicolon, the selection line will be read from the console
instead. Selection lines in a command file should end with a
semicolon. If one ends with a comma, everything up to the next
semicolon will be ignored.
Every keyword expression and selection line read from a command file
will be typed out preceded by an at-sign (@) to indicate that it came
from the file.
Finally, whenever an unrecognized keyword is read from a command
file, it is automatically searched for without your being asked. You
can, of course, always interrupt the search (by typing a carriage
return or [ESC] I).
Here is a sample command file:
#2+#202,1; TELEVISION+TV; MOVIES,; THEATRE,C; STAGE,STAG/X←C;
This file contains five keyword expressions. The first one will
cause the latest digest (number 2 or 202) to be typed out. (See
paragraph 2 under SPECIAL FEATURES below.) Next, if any stories
about TELEVISION or TV are found, the user will be allowed to type in
his own selection line. Then, if any stories about MOVIES are found,
they will automatically be typed out (note the empty selection line
between the comma and the semicolon). If any stories about THEATRE
are found, the user will be allowed to choose which ones he wants.
If any stories about STAGE are found, the user will be allowed to
choose which ones he wants, and those he picks will be added to the
file STAG.
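The command file rules given in this section (line breaks ignored, comma introduces a selection line, semicolon ends a command) can be sketched as a small parser. This Python is a hedged illustration rather than APE's actual reader: the function name parse_command_file is hypothetical, trimming of surrounding spaces is an assumption, and a selection of None is used here to mean "read the selection line from the console".

```python
def parse_command_file(text):
    """Split command file text into (keyword expression, selection) pairs.

    selection is None when the expression ended with a semicolon, meaning
    the selection line is to be read from the console instead.
    """
    # Carriage returns, linefeeds and form feeds are completely ignored.
    text = text.translate({ord(c): None for c in "\r\n\f"})
    n = len(text)

    def read_until(start, stops):
        j = start
        while j < n and text[j] not in stops:
            j += 1
        return text[start:j], j

    commands = []
    i = 0
    while i < n:
        expr, j = read_until(i, ",;")
        if j >= n:                      # trailing text with no terminator
            break
        if text[j] == ";":
            commands.append((expr.strip(), None))
            i = j + 1
            continue
        selection, k = read_until(j + 1, ",;")
        if k < n and text[k] == ",":
            # A selection ending with a comma: skip to the next semicolon.
            _, k = read_until(k + 1, ";")
        commands.append((expr.strip(), selection.strip()))
        i = k + 1
    return commands

sample = "#2+#202,1; TELEVISION+TV; MOVIES,; THEATRE,C; STAGE,STAG/X←C;"
commands = parse_command_file(sample)
```

Applied to the sample command file above, the sketch yields five pairs; the second has a selection of None (typed in by the user) and the third has an empty selection (all stories typed out).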
SPECIAL FEATURES
1. Whenever APE is expecting input, if you type a question mark (?)
and carriage return, you will be given some help regarding what you
are to type in.
2. In addition to normal English keywords, there are two special
forms that can be used as keywords in expressions. The first
consists of a period (.) followed by an unsigned integer, e.g., ".18";
if k is the integer following the period, this form represents the
newest k stories that have come in. The second special form consists
of a number sign (#) followed by an unsigned integer, optionally
followed by a colon and another unsigned integer. The form #k
represents all the stories that have k as their AP sequence number;
the form #k:m represents all the stories with sequence numbers from k
to m (wrapping around if k>m). Using one of these forms is the only
way to get the AP news digests because the digests are not
categorized at all. (Actually, stories #1, #2, #201 and #202 are the
ones not categorized; occasionally the digest has some other
sequence number so it gets categorized.) Here are some examples of
keyword expressions using these special forms.
CHESS * .10    Among the last 10 stories that
               have come in, all those that
               mention CHESS.
#2 + #202      All stories with either of these
               sequence numbers. (These are the
               usual sequence numbers of the
               news digests.)
#325:23        All stories with sequence number
               greater than or equal to 325 or
               less than or equal to 23.
3. Typing just CARRIAGE RETURN for a keyword expression (the null
keyword expression) has a special effect; it gives you back the
stories corresponding to the previous keyword expression. These
stories constitute your CURRENT STORY LIST. With this feature you
can get back a second time the stories you just looked at. In fact,
this feature can be used consecutively any number of times, giving
the same stories every time.
4. A keyword expression may be continued over several lines. Simply
type a LINEFEED anywhere except in the middle of a word and APE will
type a carriage return and a colon (:) and wait for you to type in
more of the expression. A space is substituted for the linefeed.
5. Your current story list can be modified without typing again the
keywords you used to get it. If a keyword expression starts with +,
-, or *, the missing (first) operand is taken to be your current
story list. For example, if you have typed in "NIXON" as your last
keyword expression, you can type in "*VIETNAM" as your next
expression and you will get only stories that mention both NIXON and
VIETNAM.
6. When stories are typed out or written in a file, a row of stars
(*'s) is placed between stories. Note that corrections and additions
to a story are considered part of that story; thus they will not be
separated from it by a row of stars.
7. If you type control-O ([ESC] O on Stanford displays) during
typeout of a story, the typeout will be stopped (as usual), but will
start up again with the next story (if any).
8. While APE is typing out and/or filing stories, if you type [ESC] I
(Stanford displays only), or if you type control-C and then the
system REEnter command, APE will be back to asking for keywords, and
your current story list will not have been changed. (That is, you
can get it back by typing just carriage return; see paragraph 3
above.) Automatic command file reading is turned off when you do
this. Also, any file or spooler output going on is undone.
9. Upper and lower case characters are always equivalent.
10. If a keyword expression is preceded by a dollar sign ($), APE
will interpret that to mean that you wish to be notified whenever a
story comes in that fits the expression. (You will still be told how
many stories currently fit the expression.) Whenever such a story
does come in, a message will be sent which you will get the next time
you log in or R LOGIN. The message will say something like this:
FOUND (VIETNAM*PEACE)
IN STORY #321 1019pt 07-04
where the time and date are those (pt=Pacific Time) that appear at
the beginning of the story. Also, if you are logged in at the time
the story comes in, the message
*** AP STORY FOUND ***
will be typed out on your console. Notification is on the basis of
your logged in programmer name; however, programmer names 'GUE' and
'SYS' cannot use automatic notification because there are many people
using each of these names. Also, notification requests cannot
contain search strings (unrecognized keywords) although this will
probably change in the not too distant future. Every notification
request will expire eventually. The current plan is to purge a
request after it has existed for two months. Whenever one of your
requests expires, you will be sent a note like this:
YOUR REQUEST (VIETNAM*PEACE)
EXPIRED BEFORE STORY #321 1019 07-04
Automatic notification (AN) is intended to be used for two main
purposes. 1) If you are expecting an urgent story to come in at any
moment, and you want to be notified as soon as it comes in (assuming
you are logged in), automatic notification saves you the trouble of
running APE every half hour to find out if your story has come in.
2) If you are expecting a story to come within a couple of months,
but you don't know exactly when, then AN saves you the effort of
running APE every day, if you wouldn't otherwise do so.
If you find you are being notified about the same kind of story
several times a day, and if the stories are not particularly urgent,
then you will probably find that the normal use of APE, possibly
using a command file (see previous section) will be more convenient.
Also, the more AN requests there are, the more work the continually
running special AP programs have to do. However, you are free to
choose the method of using APE that best fits your purposes.
One final note on AN: When you get a hit from an AN request, the best
way to use APE to read the story is to type in the expression
(possibly using a command file) that got the hit. Alternatively, you
can type in the sequence number of the story found, but this is
liable to give you an extra story with the same sequence number. You
can combine these two methods and type something like (say)
"#35*CHESS", if CHESS was the AN request getting a hit on story #35.
11. If you type in a keyword expression that consists solely of a
dollar sign ($), then all notification requests you have in will be
typed out with their expiration dates.
12. If you enter the keyword expression "$$", then you will be
permitted to delete any of your notification requests.
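The wraparound rule for the #k and #k:m special forms (paragraph 2 above) can be stated compactly in code. The Python below is a minimal sketch; the function name in_sequence_range is hypothetical, and the test says nothing about how APE actually stores sequence numbers.

```python
def in_sequence_range(seq, k, m=None):
    """True if AP sequence number `seq` is selected by #k (m is None) or #k:m."""
    if m is None:
        return seq == k
    if k <= m:
        return k <= seq <= m
    # k > m: the range wraps around the daily restart of the numbering,
    # so it covers everything from k upward plus everything up to m.
    return seq >= k or seq <= m
```

For instance, the form #325:23 selects sequence numbers 325 and up together with 23 and below, so story 400 and story 5 both qualify while story 100 does not.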
NOTES
First, the news is kept in a fixed size file. This means that old
stories are continually being deleted to make room for new ones. If
this happens after you start APE, and if you attempt to read such a
deleted story, then you will get a message something like "1 OF THE
STORIES WENT AWAY--SORRY".
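The effect described above, where a story found earlier has been deleted by the time it is read, can be illustrated with a toy fixed-capacity store. This Python sketch is purely illustrative: the class name NewsFile, the use of a character count as the capacity unit, and the oldest-first eviction policy are assumptions, and each story id is assumed to be added only once.

```python
from collections import OrderedDict

class NewsFile:
    """Toy fixed-capacity story store that discards the oldest stories first."""

    def __init__(self, capacity):
        self.capacity = capacity        # total characters allowed (assumed unit)
        self.used = 0
        self.stories = OrderedDict()    # story id -> text, oldest first

    def add(self, story_id, text):
        """File a new story, deleting old ones until the file fits again."""
        self.stories[story_id] = text
        self.used += len(text)
        while self.used > self.capacity and len(self.stories) > 1:
            _, old = self.stories.popitem(last=False)
            self.used -= len(old)

    def fetch(self, story_ids):
        """Return (texts still on file, number of stories that went away)."""
        texts = [self.stories[i] for i in story_ids if i in self.stories]
        return texts, len(story_ids) - len(texts)
```

A reader that collected story ids before a new story arrived may find, on fetching, that some of them are gone; the second element of the result counts those stories.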
Finally, news that comes in after you start APE cannot be retrieved.
If you want to update APE's data to include the latest stories, type
control-C and then the system START command. (When you do this your
current story list will be re-initialized to null.)